Given a bipartite graph $G = (V_1,V_2,E)$ where edges take on {\it both} positive and negative weights from set $\mathcal{S}$, the {\it maximum weighted edge biclique} problem, or $\mathcal{S}$-MWEB for short, asks to find a bipartite subgraph whose sum of edge weights is maximized. This problem has various applications in bioinformatics, machine learning and databases, and its (in)approximability remains open. In this paper, we show that for a wide range of choices of $\mathcal{S}$, specifically when $| \frac{\min\mathcal{S}} {\max\mathcal{S}} | \in \Omega(\eta^{\delta-1/2}) \cap O(\eta^{1/2-\delta})$ (where $\eta = \max\{|V_1|, |V_2|\}$, and $\delta \in (0,1/2]$), no polynomial time algorithm can approximate $\mathcal{S}$-MWEB within a factor of $n^{\epsilon}$ for some $\epsilon > 0$ unless $\mathsf{RP = NP}$. This hardness result gives justification of the heuristic approaches adopted for various applied problems in the aforementioned areas, and indicates that good approximation algorithms are unlikely to exist. Specifically, we give two applications by showing that: 1) finding statistically significant biclusters in the SAMBA model, proposed in \cite{Tan02} for the analysis of microarray data, is $n^{\epsilon}$-inapproximable; and 2) no polynomial time algorithm exists for the Minimum Description Length with Holes problem \cite{Bu05} unless $\mathsf{RP=NP}$.
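To make the objective concrete, the following is a minimal brute-force sketch of the problem, not an algorithm from the paper. It rests on assumptions that the abstract does not spell out: the sought subgraph is taken to be a biclique (as the problem name suggests), vertex pairs missing from the weight map are treated as non-edges, and the empty subgraph with value 0 is allowed. The function name `max_weight_edge_biclique` and its parameters are hypothetical.

```python
from itertools import combinations


def max_weight_edge_biclique(left, right, weight):
    """Exhaustively search for a maximum weighted edge biclique.

    left, right: sequences of vertex labels for V_1 and V_2.
    weight: dict mapping (u, v) with u in left, v in right to an edge
            weight (assumption: pairs absent from the dict are non-edges).
    Returns (best_value, chosen_left, chosen_right).
    """
    best_value, best_left, best_right = 0, (), ()
    # Enumerate every non-empty subset of the left side (exponential time).
    for size in range(1, len(left) + 1):
        for chosen_left in combinations(left, size):
            chosen_right = []
            value = 0
            for v in right:
                # v can join only if it is adjacent to every chosen left vertex.
                if all((u, v) in weight for u in chosen_left):
                    gain = sum(weight[(u, v)] for u in chosen_left)
                    # Right vertices contribute independently once the left
                    # side is fixed, so keep exactly those with positive gain.
                    if gain > 0:
                        chosen_right.append(v)
                        value += gain
            if value > best_value:
                best_value = value
                best_left, best_right = chosen_left, tuple(chosen_right)
    return best_value, best_left, best_right
```

Once the left side is fixed, the best right side can be chosen greedily, so the search is exponential only in $|V_1|$. This is consistent with, though of course not evidence for, the hardness result stated above.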